NSF PAR Search | NSF Public Access Repository

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Ghanekar, Bhargav; Khan, Salman S; Sharma, Pranav; Singh, Shreyas; Boominathan, Vivek; Mitra, Kaushik; Veeraraghavan, Ashok (March 2024, ArXiv)

Passive, compact, single-shot 3D sensing is useful in many application areas such as microscopy, medical imaging, surgical navigation, and autonomous driving where form factor, time, and power constraints can exist. Obtaining RGB-D scene information over a short imaging distance, in an ultra-compact form factor, and in a passive, snapshot manner is challenging. Dual-pixel (DP) sensors are a potential solution to achieve the same. DP sensors collect light rays from two different halves of the lens in two interleaved pixel arrays, thus capturing two slightly different views of the scene, like a stereo camera system. However, imaging with a DP sensor implies that the defocus blur size is directly proportional to the disparity seen between the views. This creates a trade-off between disparity estimation vs. deblurring accuracy. To improve this trade-off effect, we propose CADS (Coded Aperture Dual-Pixel Sensing), in which we use a coded aperture in the imaging lens along with a DP sensor. In our approach, we jointly learn an optimal coded pattern and the reconstruction algorithm in an end-to-end optimization setting. Our resulting CADS imaging system demonstrates improvement of >1.5dB PSNR in all-in-focus (AIF) estimates and 5-6% in depth estimation quality over naive DP sensing for a wide range of aperture settings. Furthermore, we build the proposed CADS prototypes for DSLR photography settings and in an endoscope and a dermoscope form factor. Our novel coded dual-pixel sensing approach demonstrates accurate RGB-D reconstruction results in simulations and real-world experiments in a passive, snapshot, and compact manner.
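The proportionality between defocus blur and DP disparity mentioned above can be illustrated with a thin-lens model. This is a minimal sketch, not the CADS pipeline: the function names, the example lens parameters, and the scale factor `k` linking disparity to blur diameter are assumptions made for illustration.

```python
def signed_blur_diameter(depth_m, focus_m, focal_len_m, f_number):
    """Signed defocus blur diameter on the sensor (meters), thin-lens model.

    Zero at the focus distance, positive beyond it, negative in front of it.
    """
    aperture = focal_len_m / f_number  # aperture diameter
    # Lens-to-sensor distance that brings `focus_m` into focus.
    sensor_dist = focal_len_m * focus_m / (focus_m - focal_len_m)
    return aperture * sensor_dist * (1.0 / focus_m - 1.0 / depth_m)


def dp_disparity(blur_diameter, k=0.5):
    """Disparity between the two DP views, modeled as k * blur diameter.

    k is a hypothetical constant; the real value depends on the sensor.
    """
    return k * blur_diameter


if __name__ == "__main__":
    # Assumed example lens: 50 mm focal length at f/4, focused at 0.5 m.
    for depth in (0.3, 0.5, 1.0):
        b = signed_blur_diameter(depth, focus_m=0.5, focal_len_m=0.05, f_number=4.0)
        d = dp_disparity(b)
        print(f"depth={depth} m  blur={b * 1e3:.3f} mm  disparity={d * 1e3:.3f} mm")
```

Under this model, opening the aperture (a smaller f-number) scales blur and disparity together, which is the trade-off the abstract describes: stronger depth cues come with a harder deblurring problem.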
Full Text Available
FlatNet3D: intensity and absolute depth from single-shot lensless capture

https://doi.org/10.1364/JOSAA.466286

Bagadthey, Dhruvjyoti; Prabhu, Sanjana; Khan, Salman S; Fredrick, D Tony; Boominathan, Vivek; Veeraraghavan, Ashok; Mitra, Kaushik (September 2022, Journal of the Optical Society of America A)

Lensless cameras are ultra-thin imaging systems that replace the lens with a thin passive optical mask and computation. Passive mask-based lensless cameras encode depth information in their measurements for a certain depth range. Early works have shown that this encoded depth can be used to perform 3D reconstruction of close-range scenes. However, these approaches for 3D reconstructions are typically optimization based and require strong hand-crafted priors and hundreds of iterations to reconstruct. Moreover, the reconstructions suffer from low resolution, noise, and artifacts. In this work, we propose FlatNet3D, a feed-forward deep network that can estimate both depth and intensity from a single lensless capture. FlatNet3D is an end-to-end trainable deep network that directly reconstructs depth and intensity from a lensless measurement using an efficient physics-based 3D mapping stage and a fully convolutional network. Our algorithm is fast and produces high-quality results, which we validate using both simulated and real scenes captured using PhlatCam.
FlatNet: Towards Photorealistic Scene Reconstruction from Lensless Measurements

https://doi.org/10.1109/TPAMI.2020.3033882

Khan, Salman Siddique; Sundar, Varun; Boominathan, Vivek; Veeraraghavan, Ashok; Mitra, Kaushik (October 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence)
Lensless imaging has emerged as a potential solution towards realizing ultra-miniature cameras by eschewing the bulky lens in a traditional camera. Without a focusing lens, the lensless cameras rely on computational algorithms to recover the scenes from multiplexed measurements. However, the current iterative-optimization-based reconstruction algorithms produce noisier and perceptually poorer images. In this work, we propose a non-iterative deep learning-based reconstruction approach that results in orders of magnitude improvement in image quality for lensless reconstructions. Our approach, called FlatNet, lays down a framework for reconstructing high-quality photorealistic images from mask-based lensless cameras, where the camera's forward model formulation is known. FlatNet consists of two stages: (1) an inversion stage that maps the measurement into a space of intermediate reconstruction by learning parameters within the forward model formulation, and (2) a perceptual enhancement stage that improves the perceptual quality of this intermediate reconstruction. These stages are trained together in an end-to-end manner. We show high-quality reconstructions by performing extensive experiments on real and challenging scenes using two different types of lensless prototypes: one which uses a separable forward model and another, which uses a more general non-separable cropped-convolution model. Our end-to-end approach is fast, produces photorealistic reconstructions, and is easy to adopt for other mask-based lensless cameras.
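For the separable prototype mentioned above, a common way to write the forward model (as in FlatCam-style cameras) is Y = P_L X P_R^T, where X is the scene and Y is the sensor measurement. The sketch below only simulates that forward model with small made-up matrices. The names `p_left` and `p_right` and their values are illustrative assumptions, and the code does not reproduce FlatNet's learned inversion stage.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def separable_forward(scene, p_left, p_right):
    """Simulate Y = P_L @ X @ P_R^T for a noise-free separable lensless camera."""
    return matmul(matmul(p_left, scene), transpose(p_right))


if __name__ == "__main__":
    # Hypothetical 3x2 transfer matrices: the scene is 2x2, the sensor is 3x3.
    p_left = [[1, 0], [1, 1], [0, 1]]
    p_right = [[1, 1], [0, 1], [1, 0]]
    # A single point source at scene pixel (row 0, column 1).
    scene = [[0, 1], [0, 0]]
    for row in separable_forward(scene, p_left, p_right):
        print(row)
```

A single point source spreads over several sensor pixels (the outer product of one column from each matrix), which is why the measurement is multiplexed and the scene has to be recovered computationally.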
Full Text Available
CANOPIC: Pre-Digital Privacy-Enhancing Encodings for Computer Vision

https://doi.org/10.1109/ICME46284.2020.9102956

Tan, Jasper; Khan, Salman S.; Boominathan, Vivek; Byrne, Jeffrey; Baraniuk, Richard; Mitra, Kaushik; Veeraraghavan, Ashok (July 2020, IEEE International Conference on Multimedia and Expo (ICME))
The standard pipeline for many vision tasks uses a conventional camera to capture an image that is then passed to a digital processor for information extraction. In some deployments, such as private locations, the captured digital imagery contains sensitive information exposed to digital vulnerabilities such as spyware, Trojans, etc. However, in many applications, the full imagery is unnecessary for the vision task at hand. In this paper we propose an optical and analog system that preprocesses the light from the scene before it reaches the digital imager to destroy sensitive information. We explore analog and optical encodings consisting of easily implementable operations such as convolution, pooling, and quantization. We perform a case study to evaluate how such encodings can destroy face identity information while preserving enough information for face detection. The encoding parameters are learned via an alternating optimization scheme based on adversarial learning with deep neural networks. We name our system CAnOPIC (Camera with Analog and Optical Privacy-Integrating Computations) and show that it has better performance in terms of both privacy and utility than conventional optical privacy-enhancing methods such as blurring and pixelation.
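The three encoding operations named above (convolution, pooling, and quantization) can be chained on a toy image as follows. This is only an illustrative sketch with a fixed box kernel and hand-picked parameters. In the paper the encoding parameters are learned, and the function names here are hypothetical.

```python
def convolve_valid(img, kernel):
    """2D 'valid' correlation of img with kernel (lists of rows).

    Equal to convolution for symmetric kernels such as the box filter below.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)] for r in range(out_h)]


def average_pool(img, size):
    """Non-overlapping size x size average pooling (drops ragged edges)."""
    out_h, out_w = len(img) // size, len(img[0]) // size
    return [[sum(img[r * size + i][c * size + j]
                 for i in range(size) for j in range(size)) / (size * size)
             for c in range(out_w)] for r in range(out_h)]


def quantize(img, levels, max_value=1.0):
    """Round each value, clipped to [0, max_value], to `levels` uniform levels."""
    step = max_value / (levels - 1)
    return [[round(min(max(v, 0.0), max_value) / step) * step for v in row]
            for row in img]


def encode(img, kernel, pool_size, levels):
    """Chain blur -> pool -> quantize, discarding fine detail at each step."""
    return quantize(average_pool(convolve_valid(img, kernel), pool_size), levels)


if __name__ == "__main__":
    box = [[0.25, 0.25], [0.25, 0.25]]  # assumed 2x2 box blur
    image = [[1.0 if c >= 3 else 0.0 for c in range(5)] for _ in range(5)]
    for row in encode(image, box, pool_size=2, levels=2):
        print(row)
```

The 5x5 input is reduced to a 2x2 binary output, so only coarse structure survives, which is the kind of information reduction such encodings rely on to suppress identity while keeping detection possible.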
Full Text Available
